152 research outputs found

    Clustering as an example of optimizing arbitrarily chosen objective functions

    This paper reflects on the common practice of solving learning problems by optimizing arbitrarily chosen criteria in the hope that they correlate well with the criterion actually used to assess the results. The issue is investigated using clustering as an example. A unified view of clustering as an optimization problem is first proposed, motivated by the belief that typical design choices in clustering, such as the number of clusters or the similarity measure, can be, and often are, suboptimal, including from the point of view of the clustering quality measures later used for algorithm comparison and ranking. To illustrate the point, we propose a generalized clustering framework and provide a proof of concept using standard benchmark datasets and two popular clustering methods for comparison.

    La Termogènesi als calorímetres per conducció: les transformacions sòlid-sòlid i les barreges líquides (Thermogenesis in conduction calorimeters: solid-solid transformations and liquid mixtures)

    This paper presents several methods for obtaining transfer functions associated with power dissipation in real phenomena, together with a few examples of the approximate thermogenesis obtained. Calorimeters with very good dynamic characteristics (θn ∼ 3 Hz) are well suited to the study of transient phenomena such as structural transformations in solids. We first present results on the martensitic transformation β → γ' of a Cu-Zn-Al alloy. They show the jerky (highly discontinuous) character of the transformation, a significant energy dissipation, and an excellent correlation with the acoustic emission generated during the transformation, which allows a qualitative assessment of the possibilities of calorimetry within the field of differential enthalpic analysis. Second, we present an analysis of the excess enthalpies of liquid mixtures, which is of particular interest at low concentrations. Steady injection systems make it possible to reach solute molar fractions x_s ≳ 0.01. Obtaining a correct transfer function for the calorimetric system and applying effective deconvolution algorithms makes it possible to work at concentrations as low as x_s ≳ 0.001.

    Generalized hard cluster analysis

    Separation of poliovirus and poliovirus RNA on Sephadex G 200

    Peer Reviewed
    http://deepblue.lib.umich.edu/bitstream/2027.42/41675/1/705_2005_Article_BF01241426.pd

    Partitioning clustering algorithms for protein sequence data sets

    Background: Genome-sequencing projects are currently producing an enormous number of new sequences, causing protein sequence databases to grow rapidly. The unsupervised classification of these data into functional groups or families (clustering) has become one of the principal research objectives in structural and functional genomics, and computer programs that classify sequences into families automatically and accurately have become a necessity. A significant number of methods have addressed the clustering of protein sequences, and most of them fall into three major groups: hierarchical, graph-based and partitioning methods. Of these, hierarchical and graph-based approaches have been widely used. Although partitioning clustering techniques are widely used in other fields, few applications have been reported for protein sequence clustering. It has not been fully demonstrated whether partitioning methods can be applied to protein sequence data, or whether they are efficient compared with published clustering methods.
    Methods: We developed four partitioning clustering approaches that use the Smith-Waterman local-alignment algorithm to determine pairwise similarities between sequences. Four different sets of protein sequences were used as evaluation data sets for the proposed methods.
    Results: We show that these methods outperform several other published clustering methods in terms of correctly predicting a classifier and especially in terms of the correctness of the provided prediction. The software is available to academic users from the authors upon request.
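
    As a rough illustration of what a partitioning method over pairwise similarities can look like, the sketch below runs a plain k-medoids loop on a precomputed dissimilarity matrix. It is not one of the four approaches from the paper, whose details are not given in the abstract. The function name `k_medoids`, the "first k items" initialisation and the toy matrix are all assumptions made for this example; in practice the matrix would be derived from alignment scores such as Smith-Waterman.

```python
from typing import List, Tuple


def k_medoids(dist: List[List[float]], k: int,
              max_iter: int = 100) -> Tuple[List[int], List[int]]:
    """Partition items 0..n-1 into k clusters from a symmetric dissimilarity matrix.

    Returns (medoids, labels), where labels[i] is the index into `medoids`
    of the medoid closest to item i.
    """
    n = len(dist)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and the number of items")

    def assign(medoids: List[int]) -> List[int]:
        # Each item joins the cluster of its nearest medoid (ties go to the first).
        return [min(range(k), key=lambda c: dist[i][medoids[c]]) for i in range(n)]

    # Assumption: deterministic initialisation with the first k items.
    medoids = list(range(k))
    for _ in range(max_iter):
        labels = assign(medoids)
        new_medoids = []
        for c in range(k):
            members = [i for i in range(n) if labels[i] == c]
            if not members:
                # Keep the old medoid if a cluster ends up empty.
                new_medoids.append(medoids[c])
                continue
            # The new medoid minimises total dissimilarity to its cluster members.
            best = min(members, key=lambda m: sum(dist[m][j] for j in members))
            new_medoids.append(best)
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return medoids, assign(medoids)


# Hypothetical toy matrix: items 0-2 are mutually close, items 3-4 are mutually close.
toy_dist = [
    [0, 1, 2, 9, 9],
    [1, 0, 1, 9, 9],
    [2, 1, 0, 9, 9],
    [9, 9, 9, 0, 1],
    [9, 9, 9, 1, 0],
]
toy_medoids, toy_labels = k_medoids(toy_dist, k=2)
```

    Working on a dissimilarity matrix rather than on coordinates is what makes this style of method applicable to sequences, which have no natural vector representation.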

    Multiple Deprivation, Severity and Latent Sub-Groups: Advantages of Factor Mixture Modelling for Analysing Material Deprivation

    Material deprivation takes different forms and manifestations. Two individuals with the same deprivation score (i.e. number of deprivations) are likely to be unable to afford or access entirely or partially different sets of goods and services: for instance, one individual may be unable to purchase clothes and consumer durables, while another may lack access to healthcare and be deprived of adequate housing. As such, the number of possible patterns or combinations of multiple deprivation grows increasingly complex as the number of indicators rises. Given this difficulty, poverty research has an interest in understanding multiple deprivation, as this analysis might lead to the identification of meaningful population sub-groups that could be the subjects of specific policies. This article applies a factor mixture model (FMM) to a real dataset and discusses its conceptual and empirical advantages and disadvantages with respect to other methods that have been used in poverty research. The exercise suggests that FMM is based on more sensible assumptions (i.e. deprivations covary within each class), provides valuable information with which to understand multiple deprivation, and is useful for understanding the severity of deprivation and the additive properties of deprivation indicators.

    Addressing preference heterogeneity in public health policy by combining Cluster Analysis and Multi-Criteria Decision Analysis: Proof of Method.

    The use of subgroups based on biological-clinical and socio-demographic variables to deal with population heterogeneity is well established in public policy. The use of subgroups based on preferences is rare, except when religion-based, and controversial. If it were decided to treat subgroup preferences as valid determinants of public policy, a transparent analytical procedure would be needed. In this proof-of-method study we show how public preferences could be incorporated into policy decisions in a way that respects both the multi-criterial nature of those decisions and the heterogeneity of the population in relation to the importance assigned to relevant criteria. It involves combining Cluster Analysis (CA), to generate the subgroup sets of preferences, with Multi-Criteria Decision Analysis (MCDA), to provide the policy framework into which the clustered preferences are entered. We employ three CA techniques to demonstrate not only that different techniques produce different clusters, but also that choosing among techniques (as well as developing the MCDA structure) is an important task in implementing the approach in any specific policy context. Data for the illustrative, not substantive, application are from a Randomized Controlled Trial of online decision aids for Australian men aged 40-69 years considering Prostate-specific Antigen testing for prostate cancer. We show that such analyses can provide policy-makers with insights into the criterion-specific needs of different subgroups. Implementing CA and MCDA in combination to assist in the development of policies on important health and community issues such as drug coverage, reimbursement, and screening programs poses major challenges (conceptual, methodological, ethical-political, and practical), but most are exposed by the techniques, not created by them.
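
    The combination described in this abstract can be pictured with a small sketch: respondents' importance weights are grouped by cluster, averaged, and then applied to the options' ratings. The weighted-sum aggregation, the function names, and every number below are assumptions made purely for illustration; the abstract does not state which MCDA model was used, and none of the values come from the trial data. Cluster labels are taken as given, standing in for the output of whichever CA technique is chosen.

```python
from collections import defaultdict
from typing import Dict, Hashable, List


def cluster_mean_weights(weights: List[List[float]],
                         labels: List[Hashable]) -> Dict[Hashable, List[float]]:
    """Average respondents' criterion weights within each cluster, renormalised to sum to 1."""
    groups: Dict[Hashable, List[List[float]]] = defaultdict(list)
    for row, label in zip(weights, labels):
        groups[label].append(row)
    means = {}
    for label, rows in groups.items():
        avg = [sum(col) / len(rows) for col in zip(*rows)]
        total = sum(avg)
        means[label] = [a / total for a in avg]
    return means


def weighted_sum_scores(cluster_weights: Dict[Hashable, List[float]],
                        ratings: Dict[str, List[float]]) -> Dict[Hashable, Dict[str, float]]:
    """Score every option for every cluster as sum(weight * rating) over criteria."""
    return {
        label: {
            option: sum(w * r for w, r in zip(ws, option_ratings))
            for option, option_ratings in ratings.items()
        }
        for label, ws in cluster_weights.items()
    }


# Hypothetical data: two criteria, four respondents, two preference clusters.
respondent_weights = [[0.8, 0.2], [0.7, 0.3], [0.2, 0.8], [0.3, 0.7]]
cluster_labels = [0, 0, 1, 1]  # assumed to come from a prior cluster analysis
option_ratings = {"option_a": [0.9, 0.2], "option_b": [0.3, 0.8]}  # ratings per criterion

mean_weights = cluster_mean_weights(respondent_weights, cluster_labels)
scores = weighted_sum_scores(mean_weights, option_ratings)
preferred = {label: max(s, key=s.get) for label, s in scores.items()}
```

    With these made-up numbers the two clusters favour different options, which is the kind of subgroup-specific result the abstract has in mind; a different clustering of the same respondents could change the outcome.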